ing the initiation codon from the cDNA library. Moreover,
clones thus selected above.
Description

Technical field 

[0001] The present invention belongs to the field of genetic engineering, and relates to a method for screening
full-length cDNA clones.

Background Art 

[0002] Recently, genome projects targeting various animals, plants, and microorganisms have been in progress.
Numerous genes have been isolated and their functions are under investigation. In order to efficiently analyze the
functions of isolated genes, it is important to efficiently obtain cDNA clones capable of expressing complete proteins,
that is, full-length cDNA clones.

[0003] The following are known as methods for constructing a full length-enriched cDNA library: the oligo capping
method, in which an RNA linker is enzymatically bound to Cap of mRNA (Sugano & Maruyama, Proteins, Nucleic Acids
and Enzymes, 38: 476-481, 1993, Suzuki & Sugano, Proteins, Nucleic Acids and Enzymes, 41: 603-607, 1996, M.
Maruyama and S. Sugano, Gene, 138, 171-174, 1994); the modified oligo capping method developed by combining the
oligo capping method with the Okayama-Berg method (S. Kato et al., Gene, 150, 243-250, 1994, Kato & Sekine,
Unexamined Published Japanese Patent Application (JP-A) No. Hei 6-153953, published June 3, 1994); the linker
chemical-binding method, in which a DNA linker is chemically bound to Cap (N. Merenkova and D. M. Edwards,
WO 96/34981, Nov. 7, 1996); and the Cap chemical modification method by biotin modification of Cap (P. Carninci
et al., Genomics, 37, 327-336, 1996, P. Carninci et al., DNA Research, 4, 61-66, 1997). These are all methods to
modify Cap of eukaryotic mRNA and to prepare a full length-enriched cDNA library. A known method for constructing
a full length-enriched cDNA library by trapping Cap is the method using Cap-binding proteins derived from yeast or
HeLa cells for labeling a 5'-cap site (I. Edery et al., MCB, 15, 3363-3371, 1995). Also known is Cap Finder
(Clontech), which is the Cap Switch oligonucleotide method in which the Cap Switch oligonucleotide is annealed by
C-tailing the 5' end of a first strand cDNA.
[0004] A cDNA library constructed by these methods is rich in full-length cDNAs compared to that obtained by the
conventional methods. However, incomplete-length clones are also contained to some extent. To efficiently analyze the
functions of genes and to efficiently clone novel useful genes, development of methods for easily confirming whether
each clone contained in a cDNA library is full-length or not has been desired.

Disclosure of the Invention 
[0005] An objective of the present invention is to provide a method for efficiently screening full-length cDNA clones
and a method for constructing a full length-enriched cDNA library.

[0006] To achieve the above objective, the present inventors contemplated efficiently screening
full-length cDNAs from a cDNA library by using the presence or absence of a translation initiation codon as an index,
based on the fact that a cDNA deficient in a certain 5' region is likely to lack a translation initiation codon, whereas
a full-length cDNA contains an initiation codon. In particular, the inventors assumed that a full-length cDNA could be
efficiently screened from a cDNA library constructed by a method for preparing a full length-enriched cDNA library.
Specifically, the inventors thought that full-length cDNA clones could be efficiently isolated by constructing a cDNA
library by a method for preparing a full length-enriched cDNA library, determining several hundred base pairs of
nucleotide sequence from the 5' end, and analyzing the presence or absence of an initiation codon in this region to
screen the clones containing initiation codons.

[0007] However, few programs for predicting an initiation site of cDNA have been developed (e.g., A. G. Pedersen,
Proceedings of the Fifth International Conference on Intelligent Systems for Molecular Biology, p226-233, 1997, held in
Halkidiki, Greece, June 21-26, 1997). Though some programs for exon prediction have been developed ("Gene Finder",
Solovyev et al., Nucleic Acids Res., 22, 5156-5163, 1994, "Grail", Y. Xu et al., Genet-Eng-N-Y, 16, 241-253, 1994),
an initiation site cannot be accurately determined relying solely on these programs.

[0008] The present inventors therefore developed their own program for cDNA initiation codon prediction. They
determined nucleotide sequences of the 5'-region of clones contained in a cDNA library constructed by a method for
preparing a full length-enriched cDNA library, and used this software program to examine whether an initiation codon
exists in this 5'-region.

[0009] More specifically, a full length-enriched cDNA library was constructed by the oligo capping method and
nucleotide sequences of the 5'-regions of some clones contained in the cDNA library were determined. Based on the
determined sequences, the clones were divided into known and novel ones through a database search. The presence
or absence of an initiation codon and its location in the determined nucleotide sequences of the 5'-regions were judged
using the initiation codon prediction program. For the known clones, it was examined whether the location of the
initiation codon recognized by the initiation codon prediction program coincided with that of the initiation codon in
databases. Indeed, the presence or absence and location of the initiation codon in the known clones predicted by the
program coincided with the information in the databases.

[0010] Thus, the software program developed by the present inventors can accurately recognize the presence or
absence of an initiation codon and its location, and full-length cDNA clones can be efficiently screened by selecting
from the cDNA library the clones that are recognized by the program to contain an initiation codon. Moreover, a cDNA
library extremely rich in full-length cDNAs can be constructed by combining the screened clones.
[0011] The present invention relates to a method for screening full-length cDNA clones from a cDNA library and a
method for constructing a full-length cDNA library by combining cDNA clones screened by the screening method. More
specifically, it relates to:

(1) A method for isolating a full-length cDNA clone, the method comprising:

    (a) determining a nucleotide sequence from the 5'-region of a cDNA clone contained in a cDNA library,
    (b) determining the presence or absence of an initiation codon in the nucleotide sequence determined in (a)
        using an initiation codon prediction program, and
    (c) selecting clones recognized as containing the initiation codon in (b);

(2) The method of (1), wherein the cDNA library is constructed by a method for preparing a full length-enriched
cDNA library;

(3) The method of (1), wherein the cDNA library is constructed by a method comprising a step of modifying Cap of
mRNA;

(4) A method for constructing a full-length cDNA library, the method comprising:

    (a) determining a nucleotide sequence from the 5'-region of a cDNA clone contained in a cDNA library,
    (b) determining the presence or absence of an initiation codon in the nucleotide sequence determined in (a)
        using an initiation codon prediction program,
    (c) selecting clones recognized as containing the initiation codon in (b), and
    (d) combining the clones selected in (c);

(5) The method of (4), wherein the cDNA library is prepared by a method for constructing a full length-enriched
cDNA library;

(6) The method of (4), wherein the cDNA library is constructed by a method comprising a step of modifying Cap of
mRNA; and

(7) A cDNA library obtainable by the method of (4).

[0012] The present invention is based on the inventors' findings that full-length cDNA clones can be efficiently
isolated by analyzing nucleotide sequences of the 5'-region of cDNAs in a cDNA library, specifically a full
length-enriched cDNA library, by using a software program for accurately predicting a translation initiation codon,
and that a full length-enriched cDNA library can be constructed by combining the isolated cDNA clones. The method for
screening full-length cDNA clones of the present invention comprises (a) determining a nucleotide sequence from the
5'-region of a cDNA clone contained in a cDNA library, (b) determining the presence or absence of an initiation codon
in the determined nucleotide sequence using an initiation codon prediction program, and (c) selecting clones recognized
as containing the initiation codon. The method for constructing a full-length cDNA library of the present invention
comprises, in addition to the above steps (a) to (c), step (d) of combining the screened clones.
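
Steps (a) to (d) can be pictured as a simple filter over the sequenced clones. The following Python sketch is illustrative only: `screen_full_length`, `toy_predictor`, and the `threshold` argument are hypothetical names introduced here, and `toy_predictor` is a placeholder that stands in for the initiation codon prediction program rather than reproducing it.

```python
from typing import Callable, Dict, List, Tuple

# One prediction: (1-based position of the A in ATG, score between 0 and 1)
AtgCall = Tuple[int, float]


def screen_full_length(
    five_prime_reads: Dict[str, str],
    predict: Callable[[str], List[AtgCall]],
    threshold: float,
) -> List[str]:
    """Return the IDs of clones whose best ATG score reaches the threshold."""
    selected = []
    # Step (a) is assumed done: each clone ID maps to its 5'-region read.
    for clone_id, read in five_prime_reads.items():
        calls = predict(read)  # step (b): score every candidate ATG
        best = max((score for _, score in calls), default=0.0)
        if best >= threshold:  # step (c): keep clones judged to have an initiation codon
            selected.append(clone_id)
    return selected  # step (d): the combined set of selected clones


def toy_predictor(read: str) -> List[AtgCall]:
    """Placeholder only: scores the first ATG 0.9 and every later ATG 0.1."""
    calls: List[AtgCall] = []
    start = read.find("ATG")
    while start != -1:
        calls.append((start + 1, 0.9 if not calls else 0.1))
        start = read.find("ATG", start + 1)
    return calls


reads = {"clone_with_atg": "CCATGGCC", "clone_without_atg": "CCCCCC"}
print(screen_full_length(reads, toy_predictor, threshold=0.5))
```

Passing the predictor in as a function keeps the selection logic independent of how the scores are produced, so any scoring program with the same output shape could be substituted.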

[0013] In the method of the present invention, a "cDNA clone" whose nucleotide sequence of the 5'-region is to be
determined is not particularly limited. However, full-length cDNAs cannot be isolated as efficiently from clones derived
from a library not rich in full-length cDNAs as from clones derived from a full length-enriched cDNA library. Therefore,
a cDNA clone is preferably derived from a library constructed by the above-described methods for preparing a full
length-enriched cDNA library, including, for example, the oligo capping method in which an RNA linker is enzymatically
bound to Cap of mRNA (Sugano & Maruyama, Proteins, Nucleic Acids and Enzymes, 38: 476-481, 1993, Suzuki & Sugano,
Proteins, Nucleic Acids and Enzymes, 41: 603-607, 1996, M. Maruyama and S. Sugano, Gene, 138, 171-174, 1994),
the modified oligo capping method developed by combining the oligo capping method with the Okayama-Berg method (S.
Kato et al., Gene, 150, 243-250, 1994, Kato & Sekine, JP-A-Hei 6-153953, June 3, 1994), the linker chemical-binding
method in which a DNA linker is chemically bound to Cap (N. Merenkova and D. M. Edwards, WO 96/34981, Nov. 7,
1996), the Cap chemical modification method in which Cap is modified with biotin (P. Carninci et al., Genomics, 37,
327-336, 1996, P. Carninci et al., DNA Research, 4, 61-66, 1997), the method using Cap-binding proteins derived from
yeast or HeLa cells (I. Edery et al., MCB, 15, 3363-3371, 1995), or a library prepared by Cap Finder using the Cap
Switch oligonucleotide method.



[0014] A cDNA clone can be isolated from a cDNA library by standard methods described in, for example, J. Sambrook,
E. F. Fritsch & T. Maniatis, Molecular Cloning, Second Edition, Cold Spring Harbor Laboratory Press, 1989.

[0015] A nucleotide sequence can be determined from the 5'-region of a clone by, for example, standard methods
using DNA sequencing reagents and a DNA sequencer available from Applied Biosystems, etc. The entire nucleotide
sequence of a clone does not have to be determined, and determining about 1,000 nucleotides from the 5' end is
sufficient. Adequate results can also be expected by determining about 500 nucleotides, or even about 300 nucleotides.

[0016] The "initiation codon prediction program" used for analyzing a nucleotide sequence from the 5'-region of a
clone is preferably the program developed by the present inventors as described in Example 1 below. The presence or
absence of an initiation codon is judged from the score given by the program. A cDNA clone with a high score,
recognized as containing an initiation codon in the determined sequence, is judged to be a full-length cDNA, while one
with a low score, recognized as not containing an initiation codon, is judged to be an incomplete-length cDNA. Thus, a
full-length cDNA can be efficiently isolated by screening from a cDNA library a cDNA judged as containing an initiation
codon in the nucleotide sequence. Indeed, when this program was used [... 51% ...] to screen clones (the highest score
is 0.94), the content of full-length clones among the screened clones was 71% when clones showing a score of 0.5 or
higher were selected, 77% with a score of [...] or higher, 81% with a score of 0.80 or higher, and 85% with a score of
[...] or higher. Full-length clones can thus be screened with higher accuracy by selecting clones with high scores
[...].

[0017] A cDNA library re-constructed by combining clones selected by the method for screening full-length cDNA
clones of the present invention is extremely rich in full-length cDNAs compared with the parent cDNA library. [...] a
system for analyzing functions containing a mixture of expressed proteins can be obtained. This system enables
efficient cloning of useful genes.

Best Mode for Carrying out the Invention

[0018] The present invention is illustrated in detail below with reference to the following examples.

Example 1. Preparation of a program for predicting a translation initiation codon of cDNA 



[0019] The initiation codon prediction program of the present invention recognizes a putative authentic
initiation codon among all ATGs contained in a given cDNA sequence fragment. The program predicts an initiation codon
based on A) information on similarity of given regions (several tens to several hundreds of base pairs) on both sides
of a putative ATG to translational regions, and B) information on similarity of regions near a putative ATG to those
near an authentic initiation codon. Characteristics of sequences in a translational region and of regions near an
initiation codon are extracted from information on numerous sequences whose translational and non-translational
regions have been identified. The program predicts an initiation codon based on information on the above
characteristics.

[0020] The linear discriminant analysis used in Gene Finder, a program for genomic exon prediction (Solovyev V. V.,
Salamov A. A., Lawrence C. B., Predicting internal exons by oligonucleotide composition and discriminant analysis
of spliceable open reading frames, Nucleic Acids Res., 1994, 22: 5156-63), was applied to the initiation codon
prediction. In the linear discriminant analysis, information on several characteristics derived from data is digitized,
[...] into a probability of similarity to an initiation codon (a rate of correct answers obtained from data of
sequences whose initiation codon has been identified). Specifically, [...] whether each ATG contained in a given cDNA
sequence is an initiation codon or not. A threshold value is established depending on the plan of the following
analyses, that is, depending on whether [...] can be used. A weight parameter is determined so as to maximize the
performance of the prediction system, using data of sequences whose initiation codon has been identified as training
data. The above information A) and B) were each embodied in the following three pieces of information and used as
information about characteristics.

A) information on similarity of given regions (several tens to several hundreds of base pairs) on both sides of a
putative ATG to translational regions

[0021]

1: a frequency of six nucleotide base letters contained in a sequence from ATG to a stop codon (within 300 bp
downstream of ATG at longest)

2: discrepancy of the information on a frequency of six nucleotide base letters contained in 50 nucleotide bases
upstream and downstream of ATG

3: an index of similarity to a signal peptide [a hydrophobicity index of the most hydrophobic eight amino acid letters
among 30 amino acids (90 nucleotide bases) downstream of ATG]

B) information on similarity of regions near a putative ATG to those near an authentic initiation codon

[0022]

1: information on a weighted matrix using three nucleotide base letters as a unit in the region from 14 nucleotide
bases upstream of ATG to 5 nucleotide bases downstream of ATG

2: the presence or absence of other ATGs upstream of ATG in the same frame (the presence is 1 and the absence
is 0)

3: a frequency of cytosine contained in the region from 36 bases upstream of ATG to 7 bases downstream of ATG.
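
Two of the characteristics listed above (B-2 and B-3) are simple enough to sketch directly, together with the weighted sum that a linear discriminant forms from such values. The Python below is an illustration under stated assumptions, not the inventors' program: the function names, the handling of window boundaries and of intervening stop codons, and the example weights are all hypothetical.

```python
from typing import Sequence


def upstream_inframe_atg(seq: str, atg: int) -> int:
    """Characteristic B-2: 1 if another ATG lies upstream in the same frame, else 0.

    `atg` is the 0-based index of the A. Intervening stop codons are ignored,
    which is an assumption; the text does not say how they are treated.
    """
    for j in range(atg - 3, -1, -3):
        if seq[j:j + 3] == "ATG":
            return 1
    return 0


def cytosine_frequency(seq: str, atg: int, up: int = 36, down: int = 7) -> float:
    """Characteristic B-3: fraction of C in a window around the ATG.

    The window is assumed to run from `up` bases before the A to `down` bases
    after the G, clipped at the ends of the read.
    """
    window = seq[max(0, atg - up): atg + 3 + down]
    return window.count("C") / len(window)


def discriminant_score(features: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of digitized characteristics, as in a linear discriminant."""
    return sum(w * x for w, x in zip(weights, features))


seq = "ATGCCCATGAAA"
atg = 6  # second ATG, in frame with the first
features = [upstream_inframe_atg(seq, atg), cytosine_frequency(seq, atg)]
weights = [-0.5, 1.0]  # hypothetical weights; real ones would come from training data
print(features, discriminant_score(features, weights))
```

In a trained system the weights would be fitted on sequences with known initiation codons, and the resulting score compared against a threshold chosen for the downstream analysis.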

Example 2: Preparation of cDNA by the oligo capping method and analysis thereof by the program for initiation codon
prediction

[0023] A cDNA library was prepared by the oligo capping method, and plasmid DNA was extracted from each
clone by the standard method. Specifically, mRNA was extracted from human placenta and human cultured cells
(teratocarcinoma NT-2 and neuroblastoma SK-N-MC) by the method described in the reference (J. Sambrook, E. F. Fritsch
& T. Maniatis, Molecular Cloning, Second Edition, Cold Spring Harbor Laboratory Press, 1989). Using an oligo cap linker
(SEQ ID NO: 1) together with an oligo dT adaptor primer (SEQ ID NO: 2) in the case of Tables 1 & 2, or with a random
adaptor primer (SEQ ID NO: 3) in the case of Tables 3 & 4, the mRNA was subjected to BAP treatment, TAP treatment,
RNA ligation, synthesis of a first strand cDNA, and removal of RNA according to the methods described in the
references (Suzuki & Sugano, Proteins, Nucleic Acids and Enzymes, 41, 603-607, 1996, p606, Y. Suzuki et al., Gene,
200, 149-156, 1997). The first strand cDNA was then converted into double-stranded DNA by PCR, digested with SfiI,
and cloned in a defined orientation into vectors, such as pME18SCG and pMFL, digested with DraIII (Sugano &
Maruyama, Proteins, Nucleic Acids and Enzymes, 38, 476-481, 1993, p480). The obtained DNA was subjected to the
sequencing reaction using a DNA sequencing reagent (Dye Terminator Cycle Sequencing FS Ready Reaction Kit, PE Applied
Biosystems) following the manual and sequenced with a DNA sequencer (ABI PRISM 377, PE Applied Biosystems). The DNA
sequence of the 5'-region of each clone was analyzed once.

[0024] The presence or absence of an initiation codon in the DNA sequence of each clone was analyzed using the
developed program for cDNA initiation codon prediction (ATGpr). In this program, the higher the score, the
higher the probability that the ATG is an initiation codon. The maximum score is 0.94.

(1) Analysis of translation initiation codons in the clones whose open reading frames are known in databases among
cDNA prepared by the oligo capping method

[0025] Among the results for all analyzed clones, the results for the clones that are known from databases to contain
the initiation codon in the determined sequences (F-NT2RP1000020, F-NT2RP1000025, F-NT2RP1000039, and
F-NT2RP1000046) are shown in Table 1. F-NT2RP1000020 (880 bp) has 96% identity at nucleotide positions 88 to 690
to "human neuron-specific gamma-2 enolase" (GenBank accession No. M22349); F-NT2RP1000025 (645 bp), 97%
homology at positions 29 to 641 to "human alpha-tubulin mRNA" (GenBank accession No. K00558); F-NT2RP1000039
(820 bp), 96% identity at positions 12 to 820 to "human mRNA for elongation factor 1 alpha subunit (EF-1 alpha)"
(GenBank accession No. X03558); and F-NT2RP1000046 (788 bp), 97% identity at positions 3 to 788 to "human M2-type
pyruvate kinase mRNA" (GenBank accession No. M23725). The sequences of the 5'-region in these clones are shown in
SEQ ID Nos: 4, 5, 6, and 7.
Table 1

          F-NT2RP1000020      F-NT2RP1000025      F-NT2RP1000039      F-NT2RP1000046
ATG No.   Location  ATGpr     Location  ATGpr     Location  ATGpr     Location  ATGpr
          of ATG    Score     of ATG    Score     of ATG    Score     of ATG    Score
1         1         0.05      96        <0.94>    65        <0.90>    111       <0.94>
2         162       <0.84>    148       0.13      154       0.05      174       0.82
3         292       0.05      193       0.05      209       0.11      198       0.19
4         313       0.05      201       0.09      231       0.05      300       0.16
5         441       0.05      232       0.05      321       0.05      315       0.11

Note 1: < > means translation initiation codon.
Note 2: Location of ATG means the nucleotide base position of ATG in the 5'-region of a DNA sequence.
ATG No. means the number of ATG from the 5'-region of a DNA sequence.



[0026] As shown in Table 1, among the cDNA prepared by the oligo capping method, the initiation codons of full-length
clones whose open reading frames are known in databases were accurately recognized by the initiation codon prediction
program (ATGpr), coinciding with the initiation codons in databases.
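
The comparison described for Table 1 amounts to taking the highest-scoring ATG in each clone and checking it against the database position. The short Python sketch below does this with the values transcribed from Table 1; the variable and function names are introduced here for illustration only.

```python
# (location of ATG, ATGpr score) pairs transcribed from Table 1
table1 = {
    "F-NT2RP1000020": [(1, 0.05), (162, 0.84), (292, 0.05), (313, 0.05), (441, 0.05)],
    "F-NT2RP1000025": [(96, 0.94), (148, 0.13), (193, 0.05), (201, 0.09), (232, 0.05)],
    "F-NT2RP1000039": [(65, 0.90), (154, 0.05), (209, 0.11), (231, 0.05), (321, 0.05)],
    "F-NT2RP1000046": [(111, 0.94), (174, 0.82), (198, 0.19), (300, 0.16), (315, 0.11)],
}

# Positions marked < > in Table 1, i.e. the initiation codons known from databases
annotated = {
    "F-NT2RP1000020": 162,
    "F-NT2RP1000025": 96,
    "F-NT2RP1000039": 65,
    "F-NT2RP1000046": 111,
}


def best_atg(calls):
    """Return the location of the ATG with the highest score."""
    return max(calls, key=lambda call: call[1])[0]


predicted = {clone: best_atg(calls) for clone, calls in table1.items()}
for clone, position in predicted.items():
    print(clone, position, position == annotated[clone])
```

Note that F-NT2RP1000046 has a second ATG with a fairly high score (0.82 at position 174), so taking the maximum rather than the first ATG above a cutoff matters for that clone.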

(2) Analysis of initiation codons in the clones whose open reading frames are known in databases among cDNA
prepared by the oligo capping method

[0027] Among the results for the clones analyzed, the results for the clones whose initiation codon is known from
databases to be absent in the determined sequence (F-NT2RP1000013, F-NT2RP1000054, and F-NT2RP1000122) are
shown in Table 2. F-NT2RP1000013 (608 bp) has 97% identity at positions 1 to 606 to "human nuclear matrix protein
55 (nmt55) mRNA" (GenBank accession No. U89867); F-NT2RP1000054 (869 bp), 96% identity at positions 1 to 869
to "human signal recognition particle (SRP54) mRNA" (GenBank accession No. U51920); and F-NT2RP1000122 (813
bp), 98% identity at positions 1 to 813 to "H. sapiens mRNA for 2-5A binding protein" (GenBank accession No.
X76388). The sequences of the 5'-region of these clones are shown in SEQ ID Nos: 8, 9, and 10.



Table 2

          F-NT2RP1000013      F-NT2RP1000054      F-NT2RP1000122
ATG No.   Location  ATGpr     Location  ATGpr     Location  ATGpr
          of ATG    Score     of ATG    Score     of ATG    Score
1         21        0.05      31        0.12      23        0.07
2         27        0.05      60        0.20      100       0.05
3         32        0.32      87        0.05      166       0.05
4         56        0.11      97        0.05      235       0.06
5         119       0.10      146       0.05      316       0.05
6         125       0.08      172       0.05      346       0.05
7         141       0.05      180       0.11      406       0.05
8         155       0.06      218       0.07      431       0.05
9         161       0.06      272       0.05      469       0.06
10        176       0.08      319       0.07      546       0.12
11        203       0.07      346       0.05      553       0.05
12        290       0.20      363       0.07      574       0.05
13        311       0.16      409       0.05
14        314       0.12      480       0.07







[0028] As shown in Table 2, among cDNA prepared by the oligo capping method, the initiation codon prediction
program (ATGpr) did not mistakenly recognize initiation codons in incomplete-length cDNAs whose open reading
frames are known in databases and which do not contain any initiation codons.
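
The known clones of Tables 1 and 2 also give a small worked example of how a score threshold controls the content of full-length clones among the selected ones. The Python sketch below uses only the highest ATGpr score of each clone as transcribed from the two tables; `full_length_content` and the chosen thresholds are illustrative, and seven clones are far too few to reproduce library-wide percentages.

```python
# Highest ATGpr score per clone, transcribed from Table 1 (full-length clones)
best_full_length = {
    "F-NT2RP1000020": 0.84,
    "F-NT2RP1000025": 0.94,
    "F-NT2RP1000039": 0.90,
    "F-NT2RP1000046": 0.94,
}
# Highest ATGpr score per clone, transcribed from Table 2 (incomplete-length clones)
best_incomplete = {
    "F-NT2RP1000013": 0.32,
    "F-NT2RP1000054": 0.20,
    "F-NT2RP1000122": 0.12,
}


def full_length_content(full, incomplete, threshold):
    """Fraction of full-length clones among clones scoring at or above the threshold.

    Returns None when no clone is selected.
    """
    picked_full = sum(score >= threshold for score in full.values())
    picked_incomplete = sum(score >= threshold for score in incomplete.values())
    total = picked_full + picked_incomplete
    return picked_full / total if total else None


for threshold in (0.1, 0.5, 0.9):
    print(threshold, full_length_content(best_full_length, best_incomplete, threshold))
```

For these seven clones, any threshold above 0.32 and at or below 0.84 selects all four full-length clones and none of the incomplete ones; raising it further starts to discard full-length clones.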

(3) Analysis of initiation codons in novel clones among the cDNA prepared by the oligo capping method

[0029] Among the results for analyzed clones, the results for novel clones that were predicted to contain initiation
codons (F-ZRV6C1000408, F-ZRV6C1000454, F-ZRV6C1000466, F-ZRV6C1000615, and F-ZRV6C1000670) are
shown in Table 3. The sequences of the 5'-region of these clones are shown in SEQ ID Nos: 11, 12, 13, 14, and 15.



Table 3

          F-ZRV6C1000408      F-ZRV6C1000454      F-ZRV6C1000466
ATG No.   Location  ATGpr     Location  ATGpr     Location  ATGpr
          of ATG    Score     of ATG    Score     of ATG    Score
1         85        <0.94>    5         0.05      162       <0.86>
2         208       0.22      107       <0.87>    182       0.05
3         386       0.05      153       0.05      207       0.08
4         518       0.11      201       0.08      244       0.05
5         545       0.05      211       0.05      262       0.05
6                             236       0.07      303       0.11

Table 3 (cont'd)

          F-ZRV6C1000615      F-ZRV6C1000670
ATG No.   Location  ATGpr     Location  ATGpr
          of ATG    Score     of ATG    Score
1         85        <0.94>    120       <0.94>
2         208       0.26      187       0.54
3         386       0.05      312       0.06
4         518       0.09      388       0.05
5         545       0.05      445       0.05

Note: < > means predicted initiation codon.

[0030] As shown in Table 3, the predicted initiation codons in F-ZRV6C1000408, F-ZRV6C1000454,
F-ZRV6C1000466, F-ZRV6C1000615, and F-ZRV6C1000670 are "ATG" starting with "A" at positions 85, 107, 162, 85,
and 120, respectively. Therefore, these clones were judged as full-length cDNA clones.
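
The "Location of ATG" and "ATG No." columns can be reproduced by scanning a read for every ATG and reporting the 1-based position of its "A". The Python sketch below applies this to the first 120 bases of SEQ ID NO: 11 (clone F-ZRV6C1000408) as given in the sequence listing; `atg_locations` is a helper name introduced here for illustration.

```python
def atg_locations(seq: str) -> list:
    """Return the 1-based positions of the A of every ATG, in 5' to 3' order."""
    seq = seq.upper().replace(" ", "")
    return [i + 1 for i in range(len(seq) - 2) if seq[i:i + 3] == "ATG"]


# First 120 bases of SEQ ID NO: 11, kept in the ten-base blocks of the listing
seq11_first120 = (
    "AGACTCTCAC CGCAGCGGCC AGGAACGCCA GCCGTTCACG CGTTCGGTCC TCCTTGGCTG "
    "ACTCACCGCC CTCGCCGCCG CACCATGGAC GCCCCCAGGC AGGTGGTCAA CTTTGGGCCT"
)

print(atg_locations(seq11_first120))
```

The only ATG in this stretch starts at position 85, which matches ATG No. 1 for F-ZRV6C1000408 in Table 3.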
[0031] In addition, among the results for analyzed clones, the results for novel clones predicted as not containing
initiation codons (F-ZRV6C1001410, F-ZRV6C1001197, and F-ZRV6C1001472) are shown in Table 4. The sequences
of the 5'-region of these clones are shown in SEQ ID Nos: 16, 17, and 18.



Table 4

          F-ZRV6C1001410      F-ZRV6C1001197      F-ZRV6C1001472
ATG No.   Location  ATGpr     Location  ATGpr     Location  ATGpr
          of ATG    Score     of ATG    Score     of ATG    Score
1         23        0.05      5         0.24      77        0.25
2         31        0.07      141       0.25      126       0.05
3         71        0.06      202       0.05      149       0.05
4         178       0.05      219       0.05      194       0.05
5         214       0.05      228       0.05      213       0.22
6                                                 249       0.05
7                                                 338       0.09
8                                                 344       0.05
9                                                 351       0.05
10                                                365       0.05



[0032] As shown in Table 4, F-ZRV6C1001410, F-ZRV6C1001197, and F-ZRV6C1001472 were recognized as not
containing initiation codons. These clones were thus judged as incomplete-length clones.

Industrial Applicability 

[0033] The present invention provides a method for efficiently selecting full-length cDNAs. Clones selected by the 
method of the present invention can express complete proteins. Therefore, the present invention enables efficiently 
analyzing the functions of isolated genes. 
SEQUENCE LISTING

<110> Helix Research Institute, Inc. 

<120> Method for screening full-length cDNA clones 

<130> H1-806PCT 

<150> JP 09-289982 
<151> 1997-10-22 

<160> 18 

<170> PatentIn version 2.0

<210> 1 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligo-capping linker sequence 
<400> 1 

AGCAUCGAGU CGGCCUUGUU GGCCUACUGG

<210> 2 
<211> 42 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligo(dT) adapter primer sequence 
<400> 2 

GCGGCTGAAG ACGGCCTATG TGGCCTTTTT TTTTTTTTTT TT 
<210> 3
<211> 32
<212> DNA

<213> Artificial Sequence
<220>

<223> Random adapter primer sequence
<400> 3 

GCGGCTGAAG ACGGCCTATG TGGCCNNNNN NC 
<210> 4
<211> 880
<212> DNA
<213> Homo sapiens 

<400> 4 

ATGCGCCCGC GCGGCCCTAT AGGCGCCTCC TCCGCCCGCC GCCCCGGAGC CGCAGCCGCC 60 

GCCGCCACTG CCACTCCCGC TCTCTCACCG CCGCCGTCGC CACCGCCACC GCCACTGCCA 120 

CTACCACCGT CTGAGTCTGC AGTCCCGAGA TCCCAGCCAT CATGTCCATA GAGAAGATCT 180 

GGGCCCGGGA GATCCTGGAC TCCCGCGGGA ACCCCACAGT GGAGGTGGAT CTCTATACTG 240 

CCAAAGGTCC TTTCCGGGCT GCAGTGCCCA GTGGAGCCTC TACGGGCATC TATGAGGCCC 300 

TGGAGCTGAG GGATGGAGAC AAACAGCGTT ACTTAGGCAA AGGTGTCCTG AAGGCAGTGG 360 

ACCACATCAA CTCCACCATC GCGCCAGCCC TCATCAGCTC AGGTCTCTCT GTGGTGGAGC 420 

AAGAGAAACT GGACAACCTG ATGCTGGAGT TGGATGGGAC TGAGAACAAA TCCAAGTTTG 480 

GGGCCAATCC ATCCTGGGTG TGTCTCTGGC CGTGTGTAAG GCAHGGGCAA CTGAACNGGA 540 

ACTGCCCCTG TATCGCCACA TTGCTCAGCT TGGMCGGGAA CTCARACCTC ATCCTGCCTG 600 

TTGCCGGCCT TCAACGTGAT CAATGGTTGG CTTCTCATGC CTGGCAACAA ANCTGGCCAT 660 

TGCNGGAATT TTCATGATCC TCCCCHTTGG GAAACTGAAA AACTTTCCGG AATGCCCNTC 720 

CAACTAAGTT GCAAAAGGTC TACCNATACC CCCCAAGGGG AATTCCTCCA AGGGAACAAA 780 

TNCCCGGGAA AGGAATGCCC CCCAATTHTT HGGGGGAATA AAAGGTGGGC TnGCCCCCC 840 

CATTTTCCTG GAAAAAACNA TNAAAACCCT TGGGAAACTT 880 

<210> 5 
<211> 645
<212> DNA 

<213> Homo sapiens 
<400> 5 

TGTGCGTTAC TTACCTCNAC TCTTAGCTTG TCGGGGACGG TAACCGGGAC CCGGTGTCTC 60 

CTCCTGTCGC CTTCGCCTCC TAATCCCTAG CCACTATGCG TGAGTGCATC TCCATCCACG 120 

TTGGCCAGGC TGGTGTCCAH ATTGGCAATG CCTGCTGGGA GCTCTACTGC CTGGAACACG 180 

GCATCCAGCC CGATGGCCAG ATGCCAAGTG ACAAGACCAT TGGGGGAGGA GATGACTCCT 240 

TCAACACCTT CTTCAGTGAG ACGGGCGCTG GCAANCACGT GCCCCGGGCT GTGTTTGTAG 300 

ACTTGGAACC CACAGTCATT GATGAAGTTC GCACTGGCAC CTACCGCCAG CTCTTCCACC 360 

CTGAGCAGCT CATCNCAGGC AAGGAAGATG CTGCCAATAA CTATGCCCGA GGGCACTACA 420 

CCATTGGCAA GGAGATCATT GACCTTGTGT TGGACCGAAT TCGCAAGCTG GCTGACCANT 480 

GCACCGGTCT TCANGGCTTC TTGGTTTTCC ACAGCTTTGG TGGGGGAACT GGTTCTGGGT 540 

TCACCTCCCT GCTCATGGAA CGTCTCTCAG TTGATTATGG CAAGAAATCC AAGCTGGAGT 600 

TCTCCATTTA CCCAGCACCC CNGGTTTCCN CNGCTGTAHT TNGAA 645 

<210> 6 

<211> 820 
<212> DNA 
<213> Homo sapiens

<400> 6 

CTTTTTTCGC AACGGGTTTG CCGCCAGAAC ACAGGTGTCG TGAAAACTAC CCCTAAAAGC 60

CAAAATGGGA AAGGAAAAGA CTCATATCAA CATTGTCGTC ATTGGACACG TAGATTCGGG 120 

CAAGTCCACC ACTACTGGCC ATCTGATCTA TAAATGCGGT GGCATCGACA AAAGAACCAT 180 

TGAAAAATTT GAGAAGGAGG CTGCTGAGAT GGGAAAGGGC TCCTTCAAGT ATGCCTGGGT 240 

CTTGGATAAA CTGAAAGCTG AGCGTGAACG TGGTATCACC ATTGATATCT CCTTG1GGAA 300 

ATTTGAGACC AGCAAGTACT ATGTGACTAT CATTGATGCC CCAGGACACA GAGACTTTAT 360 

CAAAAACATG ATTACAGGGA CATCTCAGGC TGACTGTGCT GTCCTGATTG TTGCTGCTGG 420 

TGTTGGTGAA TTTGAAGCTG GTATCTCCAA GAATGGGCAG ACCCGAGAGC ATGCCCTTCT 480 

GGCTTACACA CTGGGTGTGA AACAACTAAT TGTCGGTGTT AACAAAATGG ATTCACTGAN 540

CCACCCTACA GCCAGAAGAA ATATGANGAA ATTGTTAAGG AAGTCAGCAC TTACATTAAG 600 

AAAATTGGCT ACAACCCCGA CACAGTANCA TTTGTGCCAA TTTCTGGTTG GAATGGTGAC 660 






AACATGCTGG AACCAAHTGC TAACATGCCT TGGTTCCAGG GATGGAAAAT CCCCCHTTAA 720 

GGATGGCNAT GCCATTGGAA CCCCCCTGCT TGAAGGCTCT GGANTGCATC CTANCACCAA 780 

CTCCTTCAAA TTGAAAAACC CCTTGCHCCC GCCTCCNCCA 820

<210> 7 

<211> 788 
<212> DNA 
<213> Homo sapiens 

<400> 7 

GAGGCTGAGG CAGTGGCTCC TTGCACAGCA GCTGCACGCG CCGTGGCTCC GGATCTCTTC 60 

GTCTTTGCAG CGTAGCCCGA GTCGGTCAGC GCCGGAGGAC CTCAGCAGCC ATGTCGAAGC 120 

CCCATAGTGA AGCCGGGACT GCCTTCATTC AGACCCAGCA GCTGCACGCA GCCATGGCTG 180 

ACACATTCCT GGAGCACATG TGCCGCCTGG ACATTGATTC ACCACCCATC ACAGCCCGGA 240 

ACACTGGCAT CATCTGTACC ATTGGCCCAG CTTCCCGATC AGTGGAGACG TTGAAGGAGA 300 

TGATTAAGTC TGGAATGAAT GTGGCICGTC TGAACTTCTC TCATGGAACT CATGAGTACC 360 

ATGCGGAGAC CATCAAGAAT GTGCGCACAG CCACGGAAAG CTTTGCTTCT GACCCCATCC 420 

TCTACCGGCC CGTTGCTGTG GCTCTAGACA CTAAAGGACC TGAGATCCGA ACTGGGCTCA 480 

TCAAGGGCAG CGGCACTGCA GAGGTGGAGC TGAAGAATGG AGCCACTCTC AAAATCACGC 540 

TGGATAATGC CTACATGGAA AAGTGTGACG AGAACATCCT GTGGCTGGAC TACAAGAACA 600 

TCTGCAAGGT GGTGGAAGTG GGCAACAAGA TCTACGTGGA TGATGGGCTN ATTTCTCTCC 660 

AGGTGAACAC AAAGGTGCCG ACTTCCTGGG TGACNGAKGT GGAAAATGGT GGCTCCTTGG 720 

GCHCAAGAAA GGTGTGAACT TCCTGGGGCT GCTGTGGAST TGCCTGCTGT GTCNGAAAAA 780 

GACATCCA 788

<210> 8 

<211> 608 
<212> DNA 
<213> Homo sapiens

<400> 8 

ACAGCCTGGC TCCTTTGAGT ATGAATATGC CATGCGCTGG AAGGCACTCA TTGAGATGGA 60 

GAAGCAGCAG CAGGACCAAG TGGACCGCAA CATCHAGGAG GCTCGTGAGA AGCTGGAGAT 120 

GGAGATGGAA GCTGCACGCC ATGAGCACCA GGTCATGCTA ATGAGACAGG ATTTGATGAG 180 
GCGCCAAGAA GAACTTCGGA GGATGGAAGA GCTGCACAAC CAAGANGTGC AAAAACGAAA 240

GCAACTGGAG CTCAGGCAGG AGGAANAGCG CAGGCGCCCT GAAGAANAGA TGCGGCGGCA 300 

GCAAGAAGAA ATGATGCGGC GACNGCAGGA AGGATTCAAG GGAACCTTCC CTGATGCGAG 360 

AGAGCAGGAG ATTCGGATGG GTCNGATGGC TATGGGAGGT GCTATGGGCA TAAACNACAG 420 

ATGTGCCATG CCCCCTGCTC CTGTGCCAGC TGGTACCCCA GCTCCTCCAG GACCTGCCAC 480 

TATTATGCCG GATGGAACTT TGGGATTGAC CCCACCMCA ACTGAACGCT TTGGTCHGGC 540 

TGCTACNATG GAAKGAATTG GGGCAATTGG TGGAACTCCT CCTGCATTCN ACCGTGCAGC 600 

TCCTGGGA 608 

<210> 9 

<211> 869
<212> DNA 
<213> Homo sapiens 

<400> 9 

ATATTAAACT AGTGAAGCAA CTAAGAGAAA ATGTTAAGTC TGCTATTGAT CTTGAAGAGA 60 

TGGCATCTGG TCTTAACAAA AGAAAAATGA TTCAGCATGC TGTATTTAAA GAACTTGTGA 120 

AGCTTGTAGA CCCTGGAGTT AAGGCATGGA CACCCACTAA AGGAAAACAA AATGTGATTA 180 

TGTTTGTTGG ATTGCAAGGG AGTGGTAAAA CAACAACATG TTCAAAGCTA GCATATTATT 240 

ACCAGAGGAA AGGTTGGAAG ACCTGTTTAA TATGTGCAGA CACATTCAGA GCAGGGGCTT 300 

TTGACCAACT AAAACAGAAT GCTACCAAAG CAAGAATTCC ATTTTATGGA AGCTATACAG 360 

AAATGGATCC TGTCATCAIT GCTTCTGAAG GAGTAGAGAA ATTTAAAAAT GAAAATTTTG 420 

AAATTATTAI TGTTGATACA AGTGGCCGCC ACAAACAAGA AGACTCTTTG TTTGAAGAAA 480 

TGCTTCAAGI TGCTAATGCT ATACAACCTG ATAACATTGT TTATGTGATG GATGCCTCCA 540 

TTGGGCAGGC TTGTGAAGCC CAGGCTAAGG CTTTTAAAGA TAAAGTAGAT GTACCTCAGT 600 

AATAGTGACA AAACTTGATG GCCATGCAAA ANGAAGTGGT GCACTCAGTG CAGTCGCTGC 660 

CACAAAAAAT CCGATTATTT TCATTGGTAC AGGGGGAACA TATAHATGAC TTTGAACCTT 720 

TCAAAAACAC AGCCTTTTAT TAACAAACTT CTTGGTATNG GCGACATTGA AAGGACTGAT 780 

AAATAAAGTC CACKAATTGA AATTTGGATG ACKATGMAAA CCCTTATTGA AAAAATTGAA 840 

ACATBGTCCA GTTTTACTTT GCGAAACNT 869 

<210> 10 

<211> 813 
<212> DNA 
<213> Homo sapiens 
<400> 10 

GTTGTGGTAT CTGTATTAAG AAATGCCCCT TTGGCGCCTT ATCAATTGTC AATCTACCAA 60 

GCAACTTGGA AAAAGAAACC ACACATCGAI ATTGTGCCAA TGCCTTCAAA CTTCACAGGT 120 

TGCCTATCCC TCGTCCAGGT GAAGTTTTGG GATTAGTTGG AACTAATGGT ATTGGAAAGT 180 

CAACTGCTTT AAAAATTTTA GCAGGAAAAC AAAAGCCAAA CCTTGGAAAG TACGATGATC 240 

CTCCTGACTG GCAGGAGATT TTGACTTATT TCCGTGGATC TGAATTACAA AATTACTTTA 300 

CAAAGATTCT AGAAGATGAC CTAAAAGCCA TCATCAAACC TCAATATGTA GACCAGATTC 360 

CTAAGGCTGC AAAGGGGACA GTGGGATCTA TTTTGGACCG AAAAGATGAA ACAAAGACAC 420 

AGGCAATTGT ATGTCAGCAG CTTGATTTAA CCCACCTAAA AGAACGAAAT GTTGAAGATC 480 

TTTCAGGAGG AGAGTTGCAG AGATTTGCTT GTGCTGTCGT TTGCATACAG AAAGCTGATA 540 

TTTTCATGTT TGATGAGCCT TCTAGTTACC TAGATGTCAA GCAGCGTTTA AAGGCTGCTA 600 

TTACTATACG ATCTCTAATA AATCCAGATA GATATATCAT TGTGGTGGAA CATGATCTAA 660 

GTGTATTAGA CTATCTCTCC GACTTCATCT GCTGTTTATA TGGTGTACCA AGCGCCTATG 720 

GAATTGTCAC TATGCCTTTT AG7GTTAGAA AAGGCATAAA CNTTTTTTGG ATGGGTATGT 780 

TCCAACAGAA AACTTGANAA TCNNAAATGC NTC 813 

<210> 11 

<211> 655 
<212> DNA 
<213> Homo sapiens 

<400> 11 

AGACTCTCAC CGCAGCGGCC AGGAACGCCA GCCGTTCACG CGTTCGGTCC TCCTTGGCTG 60 

ACTCACCGCC CTCGCCGCCG CACCATGGAC GCCCCCAGGC AGGTGGTCAA CTTTGGGCCT 120 

GGTCCCGCCA AGCTGCCGCA CTCAGTGTTG TTAGAGATAC AAAAGGAATT ATTAGACTAC 180 

AAAGGAGTTG GCATTAGTGT TCTTGAAATG AGTCACAGGT CATCAGATTT TGCCAAGATT 240 

ATTAACAATA CAGAGAATCT TGTGCGGGAA TTGCTAGCTG TTCCAGACAA CTATAAGGTG 300 

ATTTTTCTGC AAGGAGGTGG GTGCGGCCAG TTCAGTGCTG TCCCCTTAAA CCTCATTGGC 360 

TTGAAAGCAG GAAGGTGTGC GGACTATGTG GTGACAGGAG CTTGGTCAGC TAAGGCCGCA 420 

GAAGAAGCCA AGAAGTTTGG GACTATAAAT ATCGTTCACC CTAAACTTGG GAGTTATACA 480 

AAAATTCCAG ATCCAAGCAC CTGGAACCTC AACCCANATG CCTCCTACGT GTTTTATTGC 540 

NCAAATGAAA CGGTGCATGG TGTTGANTTT GACTTTATAC CCNATGTCAA GGGAACANTA 600 

CTGGTTTGTG ACATTTTCCT CCAACTTCCT GTCCAANCCA ATTGNATGTT TCCAA 655 
<210> 12 

<211> 599 
<212> DNA 
<213> Homo sapiens 

<400> 12 

AAAGATGCGC AGGCGCCGTG TGGCACTCGG CGGTCGAAAG GGGAGTTCAA GGAGACGGGG 60 

GCGACGCGGC TGAGGGCTTC TCGTCGGGGT CGGGGCTGCA GCCGTCATGC CGGGGATAGT 120 

GGAGCTGCCC ACTCTAGAGG AGCTGAAAGT AGATGAGGTG AAAATTAGTT CTGCTGTGCT 180 

TAAAGCTGCG GCCCATCACT ATGGAGCTCA ATGTGATAAG CCCAACAAGG AATTTATGCT 240 

CTGCCGCTGG GAANAGAAAG ATCCGAGGCG GTGCTTAGAG GAAGGCAAAC TGGTCAACAA 300 

GTGTGCTTTG GACTTCTTTA GGCAGATAAA ACGTCACTGT GCAGAGCCTT TTACAGAATA 360 

TTGGACTTGC ATTGATTATA CTGGCCAGCA GTTATTTCGT CACTGTCGCA AACAGCAGGC 420 

AAAGTTTGAC HAGTGTGTGC TGGACAAACT GGGCTGGGTG CGGCCTGACC TGGGAAAACT 480 

GTCAAAGGTC ACCAAAGTGA AAACAGATCN ACCTTTACCG GANAATCCCT ATCACTCAAG 540 

AACAAGAACG GATCCCAGCC CTGANATCNA AGGAAATCTG CANCCTGCCA CACATGGCA 599 

<210> 13 

<211> 597 
<212> DNA 
<213> Homo sapiens 

<400> 13 

ATATCCGGAG TAGACGGAGC CGCAGTAGAC GGATCCGCGG CTGCACCAAA CACTGCCCCT 60 

CGGAGCCTGG TAGTGGGCCA CAAGCCCCCA GTCCCAGAGC CGTGATTTTC TGGCATCCTT 120 

AAATCTTGTG TCAAGGATTG GTTATAATAT AACCAGAAAC CATGACGGCG GCTGAGAACG 180 

TATGCTACAC GTTAATIAAC GTGCCAATGG ATTCAGAACC ACCATCTGAA ATTAGCTTAA 240 

AAAATGATCT AGAAAAAGGA GATGTAAAGT CAAAGACTGA AGCTTTGAAG AAAGTAATCA 300 

TTATGATTCT GAATGGTGAA AAACTTCCTG GACTTCTGAT GACCATCATT CGTTTTGTGC 360 

TACCTCTTCA GGATCACACT ATCAAGAAAT TACTTCTGGT ATTTTGGGAG ATTGTTCCTA 420 

AAACAACTCC AGATGGGAGA CTTTTACATG AGATGATCCT TGTATGTGAT GCATACAGAA 480 

AGGATCTTCA ACATCCTAAT GAATTTATTC NAAGGATCTA CTCTTCGTTT TCTTTGCAAA 540 

TTGAAANAAA CANAATTGCT AAAACCTTTA ATGCCAHCTA TNCCTGCATT TTTGGGA 597 
<210> 14 

<211> 634 
<212> DNA 
<213> Homo sapiens 

<400> 14 

AGACTCTCAC CGCAGCGGCC AGGAACGCCA GCCGTTCACG CGTTCGGTCC TCCTTGGCTG 60 

ACTCACCGCC CTCGCCGCCG CACCATGGAC GCCCCCAGGC AGGTGGTCAA CTTTGGGCCT 120 

GGTCCCGCCA AGCTGCCGCA CTCAGTGTTG TTAGAGATAC AAAAGGAATT ATTAGACTAC 180 

AAAGGANTTG GCATTAGTGT TCTTGAAATG AGTCACAGGT CATCAGATTT TGCCAAGATT 240 

ATTAACAATA CAGAGAATCT TGTGCGGGAA TTGCTAGCTG TTCCAGACAA CTATAAGGTG 300 

ATTTTTCTGC AAGGAGGTGG GTGCGGCCAG TTCAGTGCTG TCCCCTTAAA CCTCATTGGC 360 

TTGAAAGCAG GAANGTGTGC GGACTATGTG GTGACAGGAG CTTGGTCAGC TAAGGCCGCA 420 

RAANAAGCCA AGAANTTTGG GACTATAAAT ATCGTTCACC CTAAACTTGG GAGTTATACA 480 

AAAATTCCAG ATCCAAGCAC CTGGAACCTC AACCCAGATG CCTCCTACGT GTATTATTGC 540 

GCNAATGAAA CNGTGCATGG TGTGGAHTCT GACTTTATAC CCGATGTCNA GGGAACATAC 600 

TGGTTTGTGA CATGTCCTCA AACTTCCCGT CCNA 634 

<210> 15 

<211> 757 
<212> DNA 
<213> Homo sapiens 

<400> 15 

AGTCTGCGGT GGGCTANCGG ACGGTCCGGC TTCCGGCGGC CGTTTCTGTC TCTTGCTGGC 60 

TGTCTCGCTG AATCGCGGCC GCCTTCTCAT CGCTCCTGGA AGGTCCCGAG CGCGACACCA 120 

TGTCGGAACC CGGGGGCGGC GGCGGCGAAG ACNGCTCGGC CGGATTGGAA GTGTCGGCCG 180 

TGCAKAATGT GGCGGACGTG TCGGTGCTGC ANAAGCACCT GCGCAAGCTG GTGCCGCTGC 240 

TGCTGGAGGA CGGCGGCGAA GCGCCGGCCG CGCTGGAGGC GGCGCTGGAG GAGAAGAGCG 300 

CCCTGGAGCA GATGCGCAAG TTCCTTTCGG ACCCGCACGT CCACACGGTG CTGGTGGAGC 360 

GCTCCACGCT CAAAGTGGAC GTCGGTGATG AAGGAGAAGA AGAAAAAGAA TTCATTTCCT 420 

ATAACATCAA CNTAGACATT CACTATGGGG TTAAATCCAA TAGCTTGGCA TTCATTAAAC 480 

GTACTCCCGT GATTGATGCA GATAAACCCG TGTCTTCTCA NCTCCGGGTC CTTACACTCA 540 
GTGAANACTC NCCCTACSAA AACTTTGCAT TCTTTCATTA ACAATGCAGT GGCTCCTTTT 600 

TTTAANTCCT ACATTAAAAA ATCTGGCAAG GCAAACAGGG ATGGTGATAA AATGGCTCCT 660 

TCCNTTGAAA AAAAAATTGC CGAACTCHAA ATNGGACTCC TTCCCTTGCA HCAAAATTTT 720 

TGAAATTCCG GAAAATCANC CTGCCCAATT CCTCCCC 757 

<210> 16 

<211> 300 
<212> DNA 
<213> Homo sapiens 

<400> 16 

ATCATTTCCT TATTTATATT TCATGTTGGA ATGCTTAAAT CGATAACCTT TGTATTTTGA 60 

AGTGCGCGAC ATGGAAGGTG ATCTGCAAGA GCTGCATCAG TCAAACACCG GGGGATAAAT 120 

CTGGATTTGG GTTCCGGCGT CAAGGTGAAG ATAATACCTA AAGAGGAACA CTGTAAAATG 180 

CCAGAAGCAG GTGAANAGCA ACCACAAGTT TAAATGAAGA CAAGCTGAAA CAACGCAAGC 240 

TGGTTTTATA TTAGATATTT GACTTAAACT ATCTCAATAA AGTTTTGCAG CTTTCACCAC 300 

<210> 17 

<211> 313 
<212> DNA 
<213> Homo sapiens 

<400> 17 

AAAGATGGCG GCGGGGGAGG TAGGCAGAGC AGGACGCCGC TGCTGCCGCC GCCACCGCCG 60 

CCTCCGCTCC AGTCGCCTCC GGTCCTTCAA ACTCACACCT CCCGGGAGGA GCTGTCCTGG 120 

CGCCGGGTCC CGCGGGGAAA ATGGTGGAGC CAGGGCAAGA TTTACTGCTT GCTGCTTTGA 180 

GTGAGAGTGG AATTAGTCCG AATGACTCTT TGATATTGAT GGTGGAGATG CANGGCTTGC 240 

AACTCCAATG CCTACCCCGT CAGTTCAGCA NTCAGTGCCA CTTAHTGCAT TANAACTAHG 300 

TTTGGAGACC GAA 313 

<210> 18 

<211> 667 
<212> DNA 
<213> Homo sapiens 
<400> 18 

ACTGCCGGGC TCGGCGTGAG TCGCTGCGGG GCTGACGGGG TGGCAGTGCG GCGGGTTACG 60 

GCCTGGTCAG ACCATAATGA CTTCAGCAAA TAAAGCAATC GAATTACAAC TACAAGTGAA 120 

ACAAAATGCA GAAGAATTAC AAGACTTTAT GCGGGATTTA GAAAACTGGG AAAAAGACAT 180 

TAAACAAAAG GATATGGAAC TAAGAAGACA GAATGGTGTT CCTGAAGAGA ATTTACCTCC 240 

TATTCGAAAT GGGAATTTTA GGAAAAAGAA GAAAGGCAAA GCTAAAGAGT CTTCCCCAAA 300 

ACCANAGAGG AAAACACNAA AAACAGGATA AAATCTTATG ATTATGANGC ATGGGCAAAA 360 

CTTGATGTGG ACCGTATCCT TGATGAGCTT GACAAAGACG ATAGTACCCA TGAGTCTCTG 420 

TCTCAAGAAT CAGAGTCGGA AGAAGATGGG ATTCATGTTG ATTCNCNAAA GGCTCTTGTT 480 

TTAAAAGAAA AGGGCHATAA ATACTTCCAC AAGGAAAATA TGATGAAGCA ATTGACTGCT 540 

ACACNAAAGG CNTGGATGCC GATCCATATN ATCCCGTGTT GCCAACGAAC AHAACNTCCG 600 

CATATTTTAG ACTGAAAAAA TTTGCTGTTG CTGAATCTGA TTGTTATTTAN CANTTGCCT 660 

TGAAATA 667 



Claims 

1. A method for isolating a full-length cDNA clone, the method comprising: 

(a) determining a nucleotide sequence from the 5'-region of a cDNA clone contained in a cDNA library; 

(b) determining the presence or absence of an initiation codon in the nucleotide sequence determined in (a) 
using an initiation codon prediction program; and 

(c) selecting clones recognized as containing the initiation codon in (b). 

2. The method of claim 1, wherein the cDNA library is constructed by a method for preparing a full length-enriched 
cDNA library. 

3. The method of claim 1, wherein the cDNA library is constructed by a method comprising a step of modifying Cap of 
mRNA. 

4. A method for constructing a full length cDNA library, the method comprising: 

(a) determining a nucleotide sequence from the 5'-region of a cDNA clone contained in a cDNA library; 

(b) determining the presence or absence of an initiation codon in the nucleotide sequence determined in (a) 
using an initiation codon prediction program; 

(c) selecting clones recognized as containing the initiation codon in (b); and 

(d) combining the clones selected in (c). 

5. The method of claim 4, wherein the cDNA library is prepared by a method for constructing a full length-enriched 
cDNA library. 

6. The method of claim 4, wherein the cDNA library is constructed by a method comprising a step of modifying Cap 
of mRNA. 

7. A cDNA library obtainable by the method of claim 4. 
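
Steps (a) to (c) of claims 1 and 4 can be illustrated with a minimal sketch. The claims do not name a particular initiation codon prediction program, so the heuristic below is only a stand-in and an assumption. It accepts the first ATG that has a Kozak-like context (a purine at position -3 or a G at position +4) and is followed by an open reading frame with no in-frame stop codon for at least a given number of codons. The function names and the `min_orf_codons` threshold are hypothetical and chosen for illustration only.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_initiation_codon(seq, min_orf_codons=30):
    """Return the 0-based index of the first ATG passing the heuristic, else None.

    Illustrative stand-in for an initiation codon prediction program;
    it is NOT the program used in the source document.
    """
    # Drop spaces, digits and punctuation so listing-style lines can be pasted in.
    seq = re.sub(r"[^A-Za-z]", "", seq).upper()
    # The lookahead finds every ATG, including overlapping candidates.
    for match in re.finditer(r"(?=ATG)", seq):
        pos = match.start()
        minus3 = seq[pos - 3] if pos >= 3 else ""
        plus4 = seq[pos + 3] if pos + 3 < len(seq) else ""
        # Assumed Kozak-like filter: purine at -3 or G at +4.
        if minus3 not in ("A", "G") and plus4 != "G":
            continue
        # Count in-frame codons up to the first stop codon or the end of the read.
        codons = 0
        for i in range(pos, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                break
            codons += 1
        if codons >= min_orf_codons:
            return pos
    return None


def select_clones(reads, min_orf_codons=30):
    """Step (c): keep the IDs of clones whose 5' read contains a predicted start."""
    return [
        clone_id
        for clone_id, seq in reads.items()
        if find_initiation_codon(seq, min_orf_codons) is not None
    ]
```

Here `reads` is assumed to be a dictionary mapping a clone identifier to its 5'-end sequence; combining the clones returned by `select_clones` corresponds to step (d) of claim 4. Ambiguous bases such as N are never treated as stop codons, which makes this heuristic permissive on low-quality reads.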